Weakly Supervised Critical Nugget Finding Algorithm for Improving Classification Accuracy
Author
Abstract
Accuracy is a central concern in data classification. The proposed research focuses on eliminating noisy data and on discovering attributes and properties. From the overall given population, the system predicts critical nuggets: pairs of a subpopulation and an exceptional property, which are also known as outliers. To detect critical nuggets effectively, the proposed WS-CNF algorithm applies a provisional model that identifies the exceptional property pair by means of a scoring method. Several outlier detection methods have been introduced for specific domains and applications, but they are rather generic and suffer from the subset detection problem. The proposed approach implements a DBD (Data Boundary Detection) model, which improves classification accuracy by extending the boundary values over several iterations. Together, these components are named the Weakly Supervised Critical Nugget Finding (WS-CNF) algorithm, and a primary direction algorithm is used to detect subpopulation scores for both numerical and categorical datasets. The system also performs classification to find the best class based on the score and label. Finally, the proposed algorithm reduces computation cost and addresses the lack of accuracy by applying data mining and suitable pruning techniques. The experiments and results provide the mild and extreme outlier ranges with score values.
Keywords: Outliers, Outlier detection, data mining, classification accuracy, Principal component analysis.
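The abstract does not state how the mild and extreme outlier ranges are derived. One common convention, assumed here purely for illustration and not taken from the paper, is Tukey's fences: values beyond 1.5 times the interquartile range (IQR) from the quartiles are mild outliers, and values beyond 3 times the IQR are extreme outliers. The function name `classify_outliers` and the sample data are hypothetical.

```python
from statistics import quantiles


def classify_outliers(values):
    """Split values into mild and extreme outliers using Tukey's fences.

    Assumption (not from the paper): inner fences at 1.5 * IQR and
    outer fences at 3 * IQR around the first and third quartiles.
    """
    # Quartiles computed with the inclusive method (data treated as a population).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    mild, extreme = [], []
    for v in values:
        if v < outer_low or v > outer_high:
            extreme.append(v)   # beyond the outer fences
        elif v < inner_low or v > inner_high:
            mild.append(v)      # between inner and outer fences
    return {"mild": mild, "extreme": extreme}


# Hypothetical sample: nine ordinary values plus two unusually large ones.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 100]
result = classify_outliers(sample)
```

For this sample the quartiles are 3.5 and 8.5, so the inner fences are -4 and 16 and the outer fences are -11.5 and 23.5. The WS-CNF scoring described in the abstract may define its ranges differently.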
Similar resources
Determination of Best Supervised Classification Algorithm for Land Use Maps using Satellite Images (Case Study: Baft, Kerman Province, Iran)
According to the fundamental goal of remote sensing technology, the image classification of desired sensors can be introduced as the most important part of satellite image interpretation. Various algorithms exist for supervised land use classification, and the most pertinent one should be determined. Therefore, this study has been conducted to determine the best and most su...
A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection
The K nearest neighbor (KNN) algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affect the classification accuracy of the algorithm. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...
SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy
In this paper, we propose a new gene selection algorithm based on the Shuffled Frog Leaping Algorithm, called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not usable in some tasks, for example in cancer classification....
Improved Arabic Dialect Classification with Social Media Data
Arabic dialect classification has been an important and challenging problem for Arabic language processing, especially for social media text analysis and machine translation. In this paper we propose an approach to improving Arabic dialect classification with semi-supervised learning: multiple classifiers are trained with weakly supervised, strongly supervised, and unsupervised data. Their comb...
Improving Classification Accuracy Using Code Migration
Classification is a data mining technique widely used in critical domains like financial risk analysis, biology, communication network management, etc. Classification accuracy and learning from distributed datasets are the most challenging topics in the field of supervised learning. In this paper, we first briefly review the background of parallel and distributed classification algorithms and t...